Abstract

We study a constrained version of the knapsack problem in which dependencies
between items are given by the adjacencies of a graph. In the 1-neighbour knapsack
problem, an item can be selected only if at least one of its neighbours is also selected. In
the all-neighbours knapsack problem, an item can be selected only if all its neighbours
are also selected.

We give approximation algorithms and hardness results when the nodes have both 
uniform and arbitrary weight and profit functions, and when the dependency graph is 
directed and undirected. 
1 Introduction



We consider the knapsack problem in the presence of constraints. The input is a graph
G = (V, E) where each vertex v has a weight w(v) and a profit p(v), and a knapsack of size k.
We start with the usual knapsack goal (find a set of vertices of maximum profit whose total
weight does not exceed k) but consider two natural variations. In the 1-neighbour knapsack
problem, a vertex can be selected only if at least one of its neighbours is also selected (vertices
with no neighbours can always be selected). In the all-neighbour knapsack problem, a vertex
can be selected only if all its neighbours are also selected.

We consider the problem with general (arbitrary) and uniform (p(v) = w(v) = 1 for all v)
weights and profits, and with undirected and directed graphs. In the case of directed graphs,
the constraints only apply to the out-neighbours of a vertex.

Constrained knapsack problems have applications to scheduling, tool management, in- 
vestment strategies and database storage P [H [8] . There are also applications to network 
formation. For example, suppose a set of customers C ⊆ V in a network G = (V, E) wish
to connect to a server, represented by a single sink s ∈ V. The server may activate each
edge at a cost and each customer would result in a certain profit. The server wishes to 
activate a subset of the edges with cost within the server's budget. By introducing a vertex 
mid-edge with zero-profit and weight equal to the cost of the edge and giving each customer 
zero weight, we convert this problem to a 1-neighbour knapsack problem.

1.1 Results 

We show that the eight resulting problems 

{1-neighbour, all-neighbours} x {general, uniform} x {undirected, directed} 

vary in complexity but afford several algorithmic approaches. We summarize our results for
the 1-neighbour knapsack problem in Table 1. In addition, we show that uniform, directed
all-neighbour knapsack has a PTAS but is NP-complete. The general, undirected all-neighbour
knapsack problem reduces to 0-1 knapsack, so there is a fully-polynomial time approximation 
scheme. 

In Section 2 we describe a greedy algorithm that applies to the general 1-neighbour
problem for both directed and undirected dependency graphs. The algorithm requires two
oracles: one for finding a set of vertices with high profit and another for finding a set of
vertices with high profit-to-weight ratio. In both cases, the total weight of the set cannot
exceed the knapsack capacity and the subgraph defined by the vertices must adhere to a strict
combinatorial structure which we define later. The algorithm achieves an approximation
ratio of (α/2) · (1 − 1/e^β). The approximation ratios of the oracles determine the α and β
terms respectively.

For the general, undirected 1-neighbour case, we give polynomial-time oracles that achieve
α = β = (1 − ε) for any ε > 0. This yields a polynomial time
((1 − ε)/2) · (1 − 1/e^{1−ε})-approximation. We also show that no approximation ratio better than 1 − 1/e is possible
                        Upper                             Lower

Uniform   Undirected    linear-time exact
          Directed      PTAS                              NP-hard (strong sense)

General   Undirected    ((1 − ε)/2) · (1 − 1/e^{1−ε})     1 − 1/e + ε
          Directed      open                              1/Ω(log^{2−ε} n)

Table 1: 1-Neighbour Knapsack Problem results: upper and lower bounds on the
approximation ratios for combinations of {general, uniform} x {undirected, directed}. For uniform,
undirected, the bounds are running-times of optimal algorithms.

(assuming P ≠ NP). This matches the upper bound up to (almost) a factor of 2. These results
appear in Section 2.1.



In Section 2.2 we show that the general, directed 1-neighbour knapsack problem is
1/Ω(log^{2−ε} n)-hard to approximate, even in DAGs.

In Section 3 we show that the uniform, directed 1-neighbour knapsack problem is NP-hard
in the strong sense but that it has a polynomial-time approximation scheme (PTAS).^1
Thus, as with general, undirected 1-neighbour problem, our upper and lower bounds are 
essentially matching. 

In Section 4 we show that the uniform, undirected 1-neighbour knapsack problem affords
a simple, linear-time solution. 

In Section [5] we show that uniform, directed all-neighbour knapsack has a PTAS but is 
NP-complete. We also discuss the general, undirected all-neighbour problem. 

1.2 Related work 

There is a tremendous amount of work on maximizing submodular functions under a single 
knapsack constraint [Hj , multiple knapsack constraints jT2] , and both knapsack and matroid 
constraints [T3l [4]. While our profit function is submodular, the constraints given by the 
graph are not characterized by a matroid (our solutions, for example, are not closed down- 
ward). Thus, the 1-neighbour knapsack problem represents a class of knapsack problems 
with realistic constraints that are not captured by previous work. 



As we show in Section 2.1.2, the general, undirected 1-neighbour knapsack problem
generalizes several maximum coverage problems including the budgeted variant considered by
Khuller, Moss, and Naor [10] which has a tight (1 − 1/e)-approximation unless P=NP. Our
algorithm for the general 1-neighbour problem follows the approach taken by Khuller, Moss,
and Naor but, because of the dependency graph, requires several new technical ideas. In
particular, our analysis of the greedy step represents a non-trivial generalization of the standard
greedy algorithm for submodular maximization.

^1 A PTAS is an algorithm that, given a fixed constant ε < 1, runs in polynomial time and returns a
solution within 1 − ε of optimal. The algorithm may be exponential in 1/ε.

Johnson and Niemi [8] give an FPTAS for knapsack problems on dependency graphs
that are in-arborescences (these are directed trees in which every arc is directed toward a 
single root). In their problem formulation, the constraints are given as out-arborescences — 
directed trees in which every arc is directed away from a single root — and feasible solutions 
are subsets of vertices that are closed under the predecessor operation. This problem can be 
viewed as an instance of the general, directed 1-neighbour knapsack problem. 

In the subset-union knapsack problem (SUKP) each item is a subset of a ground 
set of elements. Each element in the ground set has a weight and each item has a profit 
and the goal is to find a maximum-profit set of elements where the weight of the union of 
the elements in the sets fits in the knapsack. It is easy to see that this is a special case of 
the general, directed all-neighbours knapsack problem in which there is a vertex for each 
item and each element and an arc from an item to each element in the item's set. In [9],
Kellerer, Pferschy, and Pisinger show that SUKP is NP-hard and give an optimal but badly
exponential algorithm. The precedence constrained knapsack problem [1] and
partially-ordered knapsack problem [11] are special cases of the general, directed all-neighbours
knapsack problem in which the dependency graph is a DAG. Hajiaghayi et al. show that the
partially-ordered knapsack problem is hard to approximate within a 2^{log^δ n} factor unless
3SAT ∈ DTIME(2^{n^{3/4+ε}}) [5].

1.3 Notation. 

We consider graphs G with n vertices V(G) and m edges E(G). Whether the graph is directed
or undirected will be clear from context and we refer to edges of directed graphs as arcs. For
an undirected graph, N_G(v) denotes the neighbours of a vertex v in G. For a directed graph,
N_G(v) denotes the out-neighbours of v in G, or, more formally, N_G(v) = {u : vu ∈ E(G)}.
Given a set of nodes X, N_G(X) is the set of nodes not in X but that have a neighbour
(or out-neighbour in the directed case) in X. That is, N_G(X) = {u : uv ∈ E(G), u ∉ X,
and v ∈ X}. The degree (in undirected graphs) and out-degree (in directed graphs) of
a vertex v in G is denoted δ_G(v). The subscript G will be dropped when the graph is clear
from context. For a set of vertices or edges U, G[U] is the graph induced on U.

For a directed graph G, D is the directed, acyclic graph (DAG) resulting from contracting
maximal strongly-connected components (SCCs) of G. For each node u ∈ V(D), let V(u)
be the set of vertices of G that are contracted to obtain u.

For a vertex u, let desc_G(u) be the set of all descendants of u in G, i.e., all the vertices
in G that are reachable from u (including u). A vertex is its own descendant, but not its 
own strict descendant. 

For convenience, extend any function f defined on items in a set X to any subset A ⊆ X
by letting f(A) = Σ_{a ∈ A} f(a). If f(a) is a set, then f(A) = ∪_{a ∈ A} f(a). If f is defined over
vertices, then we extend it to edges: f(E) = f(V(E)). For any knapsack problem, OPT is
the set of vertices/items in an optimal solution.
Figure 1: An undirected graph. If ℋ is the family of star graphs, then the shaded regions give
the only viable partition of the nodes; no other partition yields 1-neighbour sets. However,
every edge is viable with respect to ℋ. The singleton node is also viable since it is a 1-neighbour
set for the graph.

1.4 Viable Families and Viable Sets. 

A set of nodes U is a 1-neighbour set for G if for every vertex v ∈ U, |N_{G[U]}(v)| ≥
min{δ_G(v), 1}. That is, a 1-neighbour set is feasible with respect to the dependency graph.
A family of graphs ℋ is a viable family for G if, for any subgraph G' of G, there exists a
partition Y_ℋ(G') of G' into 1-neighbour sets for G', such that for every Y ∈ Y_ℋ(G'), there
is a graph H ∈ ℋ spanning G[Y]. For directed graphs, we take spanning to mean that H is
a directed subgraph of G[Y] and that Y and H contain the same number of nodes. For a
graph G, we call Y_ℋ(G) a viable partition of G with respect to ℋ.
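
The feasibility condition for a 1-neighbour set can be checked directly from its definition. The following is a minimal sketch, assuming the graph is stored as a dictionary that maps each vertex to the set of its neighbours (or out-neighbours, for a directed graph); the function name and the representation are illustrative choices, not part of the paper.

```python
def is_one_neighbour_set(adj, U):
    """Return True if U is a 1-neighbour set for the graph described by adj.

    adj maps each vertex to the set of its neighbours (undirected graph)
    or out-neighbours (directed graph). This representation is an assumption
    made for the sketch.
    """
    U = set(U)
    for v in U:
        neighbours = adj.get(v, set())
        # A vertex with degree (or out-degree) 0 may always be selected.
        if not neighbours:
            continue
        # Otherwise at least one (out-)neighbour must be selected as well.
        if not (neighbours & U):
            return False
    return True
```

For instance, on the directed graph with arcs (u, v) and (w, v), the set {v} is feasible because v has out-degree 0, while {u} alone is not.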

In Section 2.1 we show that star graphs form a viable family for any undirected dependency
graph. That is, we show that any undirected graph can be partitioned into 1-neighbour
sets that are stars. Fig. 1 gives an example. In contrast, edges do not form a viable family
since, for example, a simple path with 3 nodes cannot be partitioned into 1-neighbour sets
that are edges. For DAGs, in-arborescences are a viable family but directed paths are not
(consider a directed graph with 3 nodes u, v, w and two arcs (u, v) and (w, v)). Note that
every vertex must be included as a set on its own in any viable family.

A 1-neighbour set U for G is viable with respect to ℋ if there is a graph H ∈ ℋ spanning
G[U]. Note that the 1-neighbour sets in Y_ℋ(G) are, by definition, viable for G, but a viable
set for G need not be in Y_ℋ(G). For example, if ℋ is the family of stars and G is the
undirected graph in Fig. 1, then any edge is a viable set for G but the only viable partition
is the shaded region. Note that if U is a viable set for G then it is also a viable set for any
subgraph G' of G provided U ⊆ V(G').

Viable families and viable sets play an essential role in our greedy algorithm for the 
general 1-neighbour knapsack problem. Viable families establish a set of structures over 
which our oracles can search. This restriction simplifies both the design and analysis of 
efficient oracles as well as coupling the oracles to a shared family of graphs which, as we'll 
show later, is essential to our analysis. In essence, viable families provide a mechanism to 
coordinate the oracles into returning sets with roughly similar structure. Viable sets correctly 
capture the idea of an indivisible unit of choice in the greedy step. We formalize this with
the following lemma, which is illustrated in Fig. 2.

Figure 2: An undirected graph G in (a) and a directed graph G in (b) with 1-neighbour sets A
(dark shaded) and B (dotted) marked in both. Similarly, in both (a) and (b) the lightly
shaded regions give viable partitions for G[A \ B] and the white nodes denote N_G(B). In (a)
Y_2 is viable for G[A \ B], and since |Y_2| = 2, it is viable for G[V(G) \ B]. Y_1 is not viable for
G[V(G) \ B] but it is in N_G(B). In (b), one of the two sets is viable in G[V(G) \ B] as it stands,
whereas the other is viable because we consider G[V(G) \ B] with the dotted arc removed.

Lemma 1. Let G be a graph and ℋ be a viable family for G. Let A and B be 1-neighbour
sets for G. If Y_ℋ(C) is a viable partition of G[C] where C = A \ B, then every set Y ∈ Y_ℋ(C)
is either (i) a singleton node y such that y ∈ N_G(B) (i.e., y has a neighbour in B), or (ii)
a viable set for G', which is the subgraph obtained by deleting vertices in B and arcs in X
where X is empty if G is undirected and X is the set of arcs with tails in N_G(B) if G is
directed.

Proof. If |Y| = 1 then let Y = {y}. If δ_G(y) = 0 then y is a viable set for G so it is a viable
set for G'. Otherwise, since A is a 1-neighbour set for G, y must have a neighbour in B so
y ∈ N_G(B). If |Y| > 1 then, provided G is undirected, Y is also a viable set in G so it is a
viable set in G'. If G is directed and Y contains a node y that is in N_G(B), an arc out of y
is not needed for feasibility since y already has a neighbour in A.



2 The general 1-neighbour knapsack problem 

Here we give a greedy algorithm Greedy-1-Neighbour for the general 1-neighbour knapsack
problem on both directed and undirected graphs. A formal description of our algorithm
is available in Fig. 3. Greedy-1-Neighbour relies on two oracles Best-Profit-Viable
and Best-Ratio-Viable which find viable sets of nodes with respect to a fixed viable
family ℋ. In each iteration i, we call Best-Ratio-Viable which, given the nodes not
yet chosen by the algorithm, returns the highest profit-to-weight ratio, viable set S_i with


□ 
Greedy-1-Neighbour(G, k):

  S_max = Best-Profit-Viable(G, k)
  K = k, U = ∅, i = 1, G' = G, Z = ∅
  WHILE there is either a viable set in G' or a node in Z with weight ≤ K
    S_i = Best-Ratio-Viable(G', K)
    s_i = argmax{p(v)/w(v) | v ∈ Z}
    IF p(s_i)/w(s_i) > p(S_i)/w(S_i)
      S_i = {s_i}
    G' = G[V(G') \ S_i]
    i = i + 1, U = U ∪ V(S_i), K = K − w(S_i)
    Z = N_G(U)
    If G is directed, remove any arc in G' with a tail in Z
  RETURN argmax{p(S_max), p(U)}

Figure 3: The Greedy-1-Neighbour algorithm. In each iteration i, we greedily add
either the viable set S_i or the node s_i to our knapsack U depending on which has higher
profit-to-weight ratio. This continues until we can no longer add nodes to the knapsack.

weight not exceeding the remaining capacity. We also consider the set of nodes Z not in the
knapsack, but with at least one neighbour already in the knapsack. Let s_i be the node with
highest profit-to-weight ratio in Z not exceeding the remaining capacity. We greedily add
either s_i or S_i to our knapsack U depending on which has higher profit-to-weight ratio. We
continue until we can no longer add nodes to the knapsack.

For a viable family ℋ, if we can efficiently approximate the highest profit-to-weight
ratio viable set to within a factor of β and if we can efficiently approximate the highest
profit viable set to within a factor of α, then our greedy algorithm yields a polynomial time
(α/2)(1 − 1/e^β)-approximation.

Theorem 2. Greedy-1-Neighbour is a (α/2)(1 − 1/e^β)-approximation for the general 1-neighbour
problem on directed and undirected graphs.

Proof. Let OPT be the set of vertices in an optimal solution. In addition, let U_i = ∪_{j=1}^{i} V(S_j)
correspond to U after the first i iterations where U_0 = ∅. Let ℓ + 1 be the first iteration in
which there is either a node in Z ∩ OPT or a viable set in OPT \ U_ℓ whose profit-to-weight
ratio is larger than that of S_{ℓ+1}. Of these, let \bar{S}_{ℓ+1} be the node or set with highest profit-per-weight.
For convenience, let \bar{S}_i = S_i and \bar{U}_i = U_i for i = 1, …, ℓ, and \bar{U}_{ℓ+1} = U_ℓ ∪ \bar{S}_{ℓ+1}. Notice that
U_ℓ is a feasible solution to our problem but that \bar{U}_{ℓ+1} is not since it contains \bar{S}_{ℓ+1} which has
weight exceeding K. We analyze our algorithm with respect to \bar{U}_{ℓ+1}.

Lemma 3. For each iteration i = 1, …, ℓ + 1, the following holds:

    p(\bar{S}_i) ≥ β · (w(\bar{S}_i)/k) · (p(OPT) − p(\bar{U}_{i−1}))
Proof. Fix an iteration i and let I be the graph induced by OPT \ \bar{U}_{i−1}. Since both OPT
and \bar{U}_{i−1} are 1-neighbour sets for G, by Lemma 1, each Y ∈ Y_ℋ(I) is either a viable set for
G' (so it can be selected by Best-Ratio-Viable) or a singleton vertex in N_G(\bar{U}_{i−1}) (which
Greedy-1-Neighbour always considers). Thus, if i ≤ ℓ, then by the greedy choice of the
algorithm and approximation ratio of Best-Ratio-Viable we have

    p(\bar{S}_i)/w(\bar{S}_i) ≥ β · p(Y)/w(Y)  for every Y ∈ Y_ℋ(I).    (1)

If i = ℓ + 1 then p(\bar{S}_{ℓ+1})/w(\bar{S}_{ℓ+1}) is, by definition, at least as large as the profit-to-weight
ratio of any Y ∈ Y_ℋ(I). It follows that for i = 1, …, ℓ + 1:

    p(OPT) − p(\bar{U}_{i−1}) ≤ Σ_{Y ∈ Y_ℋ(I)} p(Y)
        ≤ (1/β) · (p(\bar{S}_i)/w(\bar{S}_i)) · Σ_{Y ∈ Y_ℋ(I)} w(Y),   by Eq. (1)
        ≤ (1/β) · (p(\bar{S}_i)/w(\bar{S}_i)) · w(OPT),   since I is a subset of OPT
        ≤ (1/β) · (k/w(\bar{S}_i)) · p(\bar{S}_i),   since w(OPT) ≤ k

Rearranging gives Lemma 3. □

Lemma 4. For i = 1, …, ℓ + 1, the following holds:

    p(\bar{U}_i) ≥ [1 − Π_{j=1}^{i} (1 − β · w(\bar{S}_j)/k)] · p(OPT)



Proof. We prove the lemma by induction on i. For i = 1, we need to show that

    p(\bar{U}_1) ≥ β · (w(\bar{S}_1)/k) · p(OPT).    (2)

This follows immediately from Lemma 3 since p(\bar{U}_0) = 0 and \bar{U}_1 = \bar{S}_1. Suppose the lemma
holds for iterations 1 through i − 1. Then it is easy to show that the inequality holds for
iteration i by applying Lemma 3 and the inductive hypothesis. This completes the proof of
Lemma 4. □

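We are now ready to prove Theorem 2. Starting with the inequality in Lemma 4 and
using the fact that adding \bar{S}_{ℓ+1} violates the knapsack constraint (so w(\bar{U}_{ℓ+1}) > k) we have

    p(\bar{U}_{ℓ+1}) ≥ [1 − Π_{j=1}^{ℓ+1} (1 − β · w(\bar{S}_j)/k)] · p(OPT)
                ≥ [1 − Π_{j=1}^{ℓ+1} (1 − β · w(\bar{S}_j)/w(\bar{U}_{ℓ+1}))] · p(OPT)
                ≥ [1 − (1 − β/(ℓ + 1))^{ℓ+1}] · p(OPT) ≥ (1 − 1/e^β) · p(OPT)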
where the penultimate inequality follows because equal w(\bar{S}_j) maximize the product. Since
S_max is within a factor of α of the maximum profit viable set of weight ≤ k and \bar{S}_{ℓ+1} is
contained in OPT, p(S_max) ≥ α · p(\bar{S}_{ℓ+1}). Thus, we have p(U) + p(S_max)/α ≥ p(U_ℓ) + p(\bar{S}_{ℓ+1}) =
p(\bar{U}_{ℓ+1}) ≥ (1 − 1/e^β) p(OPT). Therefore max{p(U), p(S_max)} ≥ (α/2)(1 − 1/e^β) p(OPT).

□ 

2.1 The general, undirected 1-neighbour problem 

Here we formally show that stars are a viable family for undirected graphs and describe
polynomial-time implementations of Best-Profit-Viable and Best-Ratio-Viable for
the star family. Both oracles achieve an approximation ratio of (1 − ε) for any ε > 0. Combined
with Greedy-1-Neighbour this yields a polynomial time ((1 − ε)/2) · (1 − 1/e^{1−ε})-approximation
for the general, undirected 1-neighbour problem. In addition, we show that
this approximation is nearly tight by showing that the general, undirected 1-neighbour problem
generalizes many coverage problems including max k-cover and budgeted maximum
coverage, neither of which has a (1 − 1/e + ε)-approximation for any ε > 0 unless P=NP.

2.1.1 Stars 

For the rest of this section, we assume ℋ is the family of star graphs (i.e., graphs composed
of a center vertex u and a (possibly empty) set of edges all of which have u as an endpoint)
so that given a graph G and a capacity k, Best-Profit-Viable returns the highest profit,
viable star with weight at most k and Best-Ratio-Viable returns the highest profit-to-weight,
viable star with weight at most k.

Lemma 5. The nodes of any undirected constraint graph G can be partitioned into 1- 
neighbour sets that are stars. 

Proof. Let G_i be an arbitrary connected component of G. If |V(G_i)| = 1 then V(G_i) is
trivially a 1-neighbour set and the trivial star consisting of a single node is a spanning
subgraph of G_i. If G_i is non-trivial then let T be any spanning tree of G_i and consider the
following construction: while T contains a path P with |P| > 2, remove an interior edge of
P from T. When the algorithm finishes, each path has at least one edge and at most two
edges, so T is a set of non-trivial stars, each of which is a 1-neighbour set. □
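
The construction in this proof can be sketched in a few lines. The version below is an illustration under stated assumptions, not the authors' implementation: the graph is a dictionary from each vertex to its set of neighbours, the spanning forest is built by breadth-first search, and an edge is treated as interior exactly when both of its endpoints still have degree at least 2 in the forest (which is equivalent to lying in the middle of a path with three or more edges). The name `star_partition` is hypothetical.

```python
from collections import deque


def star_partition(adj):
    """Partition the vertices of an undirected graph into vertex sets of stars.

    adj maps each vertex to the set of its neighbours.
    Returns a list of vertex sets, one per star.
    """
    # Step 1: build a spanning forest with breadth-first search.
    forest = {v: set() for v in adj}
    seen = set()
    for root in adj:
        if root in seen:
            continue
        seen.add(root)
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    forest[u].add(v)
                    forest[v].add(u)
                    queue.append(v)

    # Step 2: repeatedly delete interior edges, i.e. edges whose two
    # endpoints both have degree >= 2 in the current forest.
    changed = True
    while changed:
        changed = False
        for u in forest:
            for v in list(forest[u]):
                if len(forest[u]) >= 2 and len(forest[v]) >= 2:
                    forest[u].discard(v)
                    forest[v].discard(u)
                    changed = True

    # Step 3: the connected components of what remains are the stars.
    parts, assigned = [], set()
    for root in forest:
        if root in assigned:
            continue
        component, stack = set(), [root]
        while stack:
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(forest[u] - component)
        assigned |= component
        parts.append(component)
    return parts
```

Removing an edge whose endpoints both have degree at least 2 never isolates a vertex, so every vertex that had a neighbour in the input ends up in a star with at least one edge.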

Best-Profit-Viable. Finding the maximum profit, viable star of a graph G subject to
a knapsack constraint k reduces to the traditional unconstrained knapsack problem which
has a well-known FPTAS that runs in O(n^3/ε) time [ZlIIS]. Every vertex v ∈ V(G) defines
a knapsack problem: the items are N_G(v) and the capacity is k − w(v). Combining v with
the solution returned by the FPTAS yields a candidate star. We consider the candidate
star for each vertex and return the one with highest profit. Since we consider all possible
star centers, Best-Profit-Viable runs in O(n^4/ε) time and returns a viable star within
a factor of (1 − ε) of optimal, for any ε > 0.
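
The per-center reduction to knapsack can be illustrated as follows. This is a sketch under explicit assumptions: weights are non-negative integers, and an exact dynamic program over capacities is swapped in for the FPTAS to keep the example short. The names `knapsack` and `best_profit_star` are hypothetical. One detail is added by the sketch rather than taken from the text: when a center has neighbours but the knapsack solution is empty, the lightest neighbour is added (if it fits) so that the returned set is a 1-neighbour set.

```python
def knapsack(items, capacity):
    """Exact 0-1 knapsack for non-negative integer weights.

    items is a list of (name, weight, profit); returns (profit, chosen names).
    """
    # best[c] holds the best (profit, names) using total weight at most c.
    best = [(0, [])] * (capacity + 1)
    for name, w, p in items:
        new = list(best)
        for c in range(w, capacity + 1):
            candidate = best[c - w][0] + p
            if candidate > new[c][0]:
                new[c] = (candidate, best[c - w][1] + [name])
        best = new
    return best[capacity]


def best_profit_star(adj, weight, profit, k):
    """Return (profit, vertex set) of a maximum-profit viable star of weight <= k."""
    best_value, best_star = None, None
    for v in adj:                       # try every vertex as the star center
        room = k - weight[v]
        if room < 0:
            continue
        items = [(u, weight[u], profit[u]) for u in adj[v]]
        value, leaves = knapsack(items, room)
        if adj[v] and not leaves:
            # Assumption of this sketch: force one leaf so the star is feasible.
            u = min(adj[v], key=lambda x: weight[x])
            if weight[u] > room:
                continue                # no leaf fits, so v cannot be a center
            value, leaves = profit[u], [u]
        total = profit[v] + value
        if best_value is None or total > best_value:
            best_value, best_star = total, {v, *leaves}
    return best_value, best_star
```

Replacing `knapsack` with a profit-scaling FPTAS would recover the (1 − ε) guarantee and polynomial running time discussed in the text; the exact version here is pseudo-polynomial in k.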
Best-Ratio-Viable. We again turn to the FPTAS for the standard knapsack problem.
Our goal is to find a high profit-to-weight star in G with weight at most k. The standard
FPTAS for the unconstrained knapsack problem builds a dynamic programming table T with
n rows and nP' columns where n is the number of available items and P' is the maximum
adjusted profit over all the items. Given an item v, its adjusted profit is p'(v) = ⌊p(v) / ((ε/n) · P)⌋
where P is the true maximum profit over all the items. Each entry T[i, p] gives the weight
of the minimum weight subset over the first i items achieving profit p.

Notice that, for any fixed profit p, p/T[n, p] is the highest profit-to-weight ratio for that
p. Therefore, for 1 ≤ p ≤ nP', the p maximizing p/T[n, p] gives the highest profit-to-weight
ratio of any feasible subset provided T[n, p] ≤ k. Let S be this subset. We will show that
p(S)/w(S) is within a factor of (1 − ε) of OPT where OPT is the profit-to-weight ratio of
the highest profit-to-weight ratio feasible subset S*.
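
The table-based selection described above can be sketched as follows. The code is an illustration, not the authors' implementation: it assumes strictly positive weights, keeps only the last row of the table (indexed by adjusted profit), and stores the chosen item names alongside each minimum weight. The function name `best_ratio_subset` is hypothetical, and floating-point rounding in the profit scaling is ignored.

```python
import math


def best_ratio_subset(items, k, eps):
    """Pick the subset maximizing adjusted-profit / weight among weights <= k.

    items is a list of (name, weight, profit) with weight > 0 and profit >= 0.
    Returns the list of chosen names (empty if nothing fits).
    """
    n = len(items)
    P = max(p for _, _, p in items)
    if P == 0:
        return []
    scale = (eps / n) * P                      # profits are measured in units of (eps/n) * P
    scaled = [math.floor(p / scale) for _, _, p in items]
    total = sum(scaled)

    # T[q] = (minimum weight, names) over subsets with adjusted profit exactly q.
    T = [(math.inf, [])] * (total + 1)
    T[0] = (0, [])
    for (name, w, _), q_item in zip(items, scaled):
        new = list(T)
        for q in range(q_item, total + 1):
            candidate = T[q - q_item][0] + w
            if candidate < new[q][0]:
                new[q] = (candidate, T[q - q_item][1] + [name])
        T = new

    # Among achievable profits whose minimum weight fits, maximize q / T[q].
    feasible = [q for q in range(1, total + 1) if T[q][0] <= k]
    if not feasible:
        return []
    best_q = max(feasible, key=lambda q: q / T[q][0])
    return T[best_q][1]
```

In the oracle, this routine would be run once per candidate center v on the items N_G(v) with capacity k − w(v), as the text goes on to describe.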

Letting r(v) = p(v)/w(v) and r'(v) = p'(v)/w(v), and following [15], we have

    r(S*) − ((ε/n) · P) · r'(S*) ≤ εP/w(S*)

since, for any item v, the difference between p(v) and ((ε/n) · P) · p'(v) is at most (ε/n) · P
and we can fit at most n items in our knapsack. Because r'(S) ≥ r'(S*) and OPT is at least
P/w(S*) we have

    r(S) ≥ ((ε/n) · P) · r'(S*) ≥ r(S*) − εP/w(S*) ≥ OPT − εOPT = (1 − ε)OPT.

Now, just as with Best-Profit-Viable, every vertex v ∈ V(G) defines a knapsack instance
where N_G(v) is the set of items and k − w(v) is the capacity. We run the modified FPTAS
for knapsack on the instance defined by v and add v to the solution to produce a set of
candidate stars. We return the star with highest profit-to-weight ratio. Since we consider all
possible star centers, Best-Ratio-Viable runs in O(n^4/ε) time and returns a viable star
within a factor of (1 − ε) of optimal, for any ε > 0.

Justifying Stars Besides some isolated vertices, our solution is a set of edges, but the 
edges are not necessarily vertex disjoint. Analyzing our greedy algorithm in terms of edges 
risks counting vertices multiple times. Partitioning into stars allows us to charge increases 
in the profit from the greedy step without this risk. In fact, stars are essentially the simplest 
structure meeting this requirement which is why we use them as our viable family. 

Improving the approximation ratio. Often this style of greedy algorithm can be augmented
with an "enumeration over triples" step to improve the ratio to (1 − ε)(1 − 1/e^{1−ε}).
However, such an enumeration would require enumerating over all possible triples of stars in
our case, which cannot be done in polynomial time unless the graph has bounded degree.

2.1.2 General, undirected 1-neighbour knapsack is APX-complete 

Here we show that it is NP-hard to approximate the general, undirected 1-neighbour knapsack
problem to within a factor better than 1 − 1/e + ε for any ε > 0 via an
approximation-preserving reduction from max k-cover [2]. An instance of max k-cover is a set cover instance
(S, ℛ) where S is a ground set of n items and ℛ is a collection of subsets of S. The goal is
to cover as many items in S as possible using at most k subsets from ℛ.

Theorem 6. The general, undirected 1-neighbour knapsack problem has no (1 − 1/e + ε)-
approximation for any ε > 0 unless P=NP.

Proof. Given an instance (S, ℛ) of max k-cover, build a bipartite graph G = (U ∪ V, E)
where U has a node u_i for each s_i ∈ S and V has a node v_j for each set R_j ∈ ℛ. Add the
edge {u_i, v_j} to E if and only if s_i ∈ R_j. Assign profit p(u_i) = 1 and weight w(u_i) = 0 for
each vertex u_i ∈ U and profit p(v_j) = 0 and weight w(v_j) = 1 for each vertex v_j ∈ V. Since
no pair of vertices in U have an edge and since every vertex in U has no weight, our strategy
is to pick vertices from V and all their neighbours in U. Since every vertex of U has unit
profit, we should choose the k vertices from V which collectively have the most neighbours.
This is exactly the max k-cover problem.

□ 
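
The graph built in this reduction is small enough to write down directly. The sketch below assumes the ground set is given as an iterable of hashable elements and the collection as a list of sets whose members all belong to the ground set; the tuple labels ("u", s) and ("v", j) and the function name are illustrative choices.

```python
def max_k_cover_to_knapsack(ground_set, subsets, k):
    """Build the 1-neighbour knapsack instance for a max k-cover instance.

    Returns (adj, weight, profit, capacity), where adj maps each vertex to
    its set of neighbours in the bipartite graph.
    """
    adj, weight, profit = {}, {}, {}
    # One zero-weight, unit-profit vertex per ground element.
    for s in ground_set:
        u = ("u", s)
        adj[u], weight[u], profit[u] = set(), 0, 1
    # One unit-weight, zero-profit vertex per subset, joined to its elements.
    for j, R in enumerate(subsets):
        v = ("v", j)
        adj[v], weight[v], profit[v] = set(), 1, 0
        for s in R:
            adj[v].add(("u", s))
            adj[("u", s)].add(v)
    return adj, weight, profit, k
```

With capacity k, at most k subset vertices can be selected, and each selected subset vertex makes all of its element vertices selectable at no extra weight.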

The max k-cover problem represents a class of budgeted maximum coverage (BMC) problems
where the elements in the base set have unit profit (referred to as weights in [10]) and
the cover sets have unit weight (referred to as costs in [10]). In fact, one can use the above
reduction to represent an arbitrary BMC instance: form the same bipartite graph, assign
the element weights in BMC as vertex profits in U, and finally assign the covering set costs
in BMC as vertex weights in V.

2.2 General, directed 1-neighbour knapsack is hard to approxi- 
mate 

Here we consider the 1-neighbour knapsack problem where G is directed and has arbitrary
profits and weights. We show via a reduction from directed Steiner tree (DST) that the general,
directed 1-neighbour problem is hard to approximate within a factor of 1/Ω(log^{2−ε} n).
Our result holds for DAGs. Because of this negative result, we also don't expect that good
approximations exist for either Best-Profit-Viable or Best-Ratio-Viable for any
family of viable graphs.

In the DST problem on DAGs we are given a DAG G = (V, E) where each arc has an
associated cost, a subset of t vertices called terminals and a root vertex r. The goal is
to find a minimum cost set of arcs that together connect r to all the terminals (i.e., the arcs
form an out-arborescence rooted at r). For all ε > 0, DST admits no log^{2−ε} n-approximation
algorithm unless NP ⊆ ZTIME[n^{polylog(n)}] [6]. This result holds even for very simple DAGs
such as leveled DAGs in which r is the only root, r is at level 0, each arc goes from a vertex
at level i to a vertex at level i + 1, and there are O(log n) levels. We use leveled DAGs in
our proof of the following theorem.

Theorem 7. The general, directed 1-neighbour knapsack problem is 1/Ω(log^{2−ε} n)-hard to
approximate unless NP ⊆ ZTIME[n^{polylog(n)}].
Proof. Let D be an instance of DST where the underlying graph G is a leveled DAG with a
single root r. Suppose there is a solution to D of cost C.

Claim 8. If there is an α-approximation algorithm for the general, directed 1-neighbour
knapsack problem then a solution to D with cost O(α log t) × C can be found where t is the
number of terminals in D.

Proof. Let G = (V, A) be the DAG in instance D. We modify it to G' = (V', A') where
we split each arc e ∈ A by placing a dummy vertex on e with weight equal to the cost
of e according to D and profit of 0. In addition, we also reverse the orientation of each
arc. Finally, all other vertices are given weight 0 and terminals are assigned a profit of 1
while the non-terminal vertices of G are given a profit of 0. We create an instance N of
the general, directed 1-neighbour knapsack problem consisting of G' and budget bound of
C. By assumption, there is a solution to N with cost C and profit t. Therefore given N,
an α-approximation algorithm would produce a set of arcs whose weight is at most C and
includes at least t/α terminals. That is, it has a profit of at least t/α. Set the weights
of dummy nodes to 0 on the arcs used in the solution. Then for all terminals included in
this solution, set their profit to 0 and repeat. Standard set-cover analysis shows that after
O(α log t) repetitions, each terminal will have been connected to the root in at least one
of the solutions. Therefore the union of all the arcs in these solutions has cost at most
O(α log t) × C and connects all terminals to the root. □

Using the above claim, we'll show that if there is an α-approximation algorithm for the
general, directed 1-neighbour problem then there is an O(α log t)-approximation algorithm
for DST which implies the theorem. Let L be the total cost of the arcs in the instance of
DST. For each 2^i ≤ L, take C = 2^i and perform the procedure in the previous claim for
α log t iterations. If after these iterations all terminals are connected to the root then call
the cost of the resulting arcs a valid cost. Finally, choose the smallest valid cost; the
corresponding value of C will be no more than 2C_OPT where C_OPT is the optimal cost of a solution for the DST
instance. By the previous claim we have a solution whose cost is at most 2C_OPT × O(α log t).

□ 

3 The uniform, directed 1-neighbour knapsack prob- 
lem 

In this section, we give a PTAS for the uniform, directed 1-neighbour knapsack problem. 
We rule out an FPTAS by proving the following theorem. 

Theorem 9. The uniform, directed 1-neighbour problem is strongly NP-hard.

Proof. The proof is a reduction from set cover. Let the base set for an instance be S =
{s_1, s_2, …, s_n} and the collection of subsets of S be ℛ = {R_1, R_2, …, R_m}. The maximum
number of sets desired to cover the base set is t.
We build an instance of the 1-neighbour knapsack problem. Let M = n + 1. The
dependency graph is as follows. For each subset R_i create a cycle C_i of size M; the set
of cycles are pairwise vertex disjoint. In each such cycle C_i choose some node arbitrarily
and denote it by c_i. For each s_j ∈ S define a new node in V and label it v_j. Define
A = {(v_j, c_i) : s_j ∈ R_i}. Let the capacity of the knapsack be k = tM + n.
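
The dependency graph of this reduction can be generated mechanically. The sketch below assumes the base set is {0, 1, …, n − 1} and the subsets are given as a list of Python sets; the tuple labels ("c", i, a) for the a-th node of cycle C_i (with a = 0 playing the role of c_i) and ("v", j) for element nodes are naming choices made here for illustration.

```python
def set_cover_to_uniform_directed(n, subsets, t):
    """Build the uniform, directed 1-neighbour instance for a set cover instance.

    Returns (out_adj, capacity), where out_adj maps each vertex to the set of
    its out-neighbours and every vertex has unit weight and unit profit.
    """
    M = n + 1
    out_adj = {}
    # One directed cycle of M nodes per subset R_i.
    for i in range(len(subsets)):
        for a in range(M):
            out_adj[("c", i, a)] = {("c", i, (a + 1) % M)}
    # One node per element s_j, with an arc to c_i for every R_i containing s_j.
    for j in range(n):
        out_adj[("v", j)] = {("c", i, 0) for i, R in enumerate(subsets) if j in R}
    return out_adj, t * M + n
```

Since M > n, the capacity tM + n leaves room for exactly t whole cycles plus the n element nodes, which is what ties a full knapsack to a cover by at most t sets.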

Suppose ℛ' is a solution to the set-cover instance. Since 1 ≤ |ℛ'| ≤ t, we can define
0 ≤ p < t to be such that |ℛ'| + p = t. Let ℛ'' = {R_{i(1)}, R_{i(2)}, …, R_{i(p)}} be a collection of p
elements of ℛ not in ℛ'. Let G' be the graph induced by the union of the nodes in C_j for
each R_j ∈ ℛ' or ℛ'' and {v_1, v_2, …, v_n}. G' consists of exactly tM + n nodes. Every vertex
in the cycles of G' has out-degree 1. Since ℛ' is a set cover, for every s_j ∈ S there is some
R_i ∈ ℛ' where s_j ∈ R_i and so the arc (v_j, c_i) is in G'. It follows that G' is a witness for a
1-neighbour set of size k = tM + n.

Now suppose that the subgraph G' of G is a solution to the 1-neighbour knapsack instance
with value k. Since M > n, it is straightforward to check that G' must consist of a collection
C of exactly t cycles, say C = {C_{a(1)}, C_{a(2)}, …, C_{a(t)}}, and each node v_i, 1 ≤ i ≤ n, along
with some arc (v_i, c_{a(j_i)}). But by definition of G, that means that s_i ∈ R_{a(j_i)} for 1 ≤ i ≤ n
and so {R_{a(j_1)}, R_{a(j_2)}, …, R_{a(j_n)}} is a solution to the set cover instance. □



3.1 A PTAS for the uniform, directed 1-neighbour problem. 

Let U be a 1-neighbour set. Let A_U be a minimal set of arcs of G such that for every vertex
u ∈ U, δ_{G[A_U]}(u) ≥ min{δ_G(u), 1}. That is, A_U is a witness to the feasibility of U as a
1-neighbour set. Since each node of U in G[A_U] has out-degree 0 or 1, the structure of A_U
has the following form.

Property 10. Each connected component of $G[A_U]$ is a cycle $C$ and a collection of
vertex-disjoint in-arborescences, each rooted at a node of $C$. $C$ may be trivial, i.e., $C$
may be a single vertex $v$, in which case $\delta_G(v) = 0$.

For a strongly connected component $X$, let $c(X)$ be the size of the shortest directed cycle
in $X$, with $c(X) = 1$ if and only if $|X| = 1$.
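
As an illustration, $c(X)$ can be computed by a breadth-first search from every vertex of the component. The sketch below is not taken from the paper: the name `shortest_cycle_length` and the adjacency-dictionary representation are assumptions, and it further assumes that the graph has no self-loops and that `scc` really is the vertex set of one strongly connected component.

```python
from collections import deque


def shortest_cycle_length(adj, scc):
    """Return c(X) for the vertex set `scc` of a strongly connected component.

    adj maps each vertex to an iterable of its out-neighbours (assumed format).
    """
    scc = set(scc)
    if len(scc) == 1:
        return 1  # trivial component: c(X) = 1 by definition
    best = len(scc)  # a non-trivial SCC always has a cycle of length <= |X|
    for s in scc:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in scc:
                    continue  # stay inside the component
                if v == s:
                    # Closing arc back to s: a cycle of length dist[u] + 1.
                    best = min(best, dist[u] + 1)
                elif v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return best
```

This takes $O(|X|)$ searches per component, which is polynomial but makes no attempt to be the fastest possible method.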

Lemma 11. There is an optimal 1-neighbour knapsack $U$ and a witness $A_U$ such that for
each non-trivial, maximal SCC $K$ of $G$, there is at most one cycle of $A_U$ in $K$ and this
cycle is a smallest cycle of $K$.

Proof. First we modify $A_U$ so that it contains smallest cycles of maximal SCCs. We rely
heavily on the structure of $A_U$ guaranteed by Property 10. The idea is illustrated in Figure 4.



Let $C$ be a cycle of $A_U$ and let $K$ be the maximal SCC of $G$ that contains $C$. Suppose
$C$ is not the smallest cycle of $K$ or there is more than one cycle of $A_U$ in $K$. Let $H$ be
the connected component of $A_U$ containing $C$. Let $C'$ be a smallest cycle of $K$. Let $P$ be
the shortest directed path from $C$ to $C'$. Since $C$ and $C'$ are in a common SCC, $P$ exists.
Let $T$ be an in-arborescence in $G$ spanning $P$, $C$ and $H$ rooted at a vertex of $C'$.

Figure 4: Construction of a witness containing the smallest cycle of an SCC. The shaded
region highlights the vertices of an SCC (edges not in $C$, $C'$ or $P$ are not depicted). The
edges of the witness are solid. (a) The smallest cycle $C'$ is not in the witness. (b) By
removing an edge from $C$ and leaf edges from the in-arborescences rooted on $C$, we create a
witness that includes the smallest cycle $C'$.

Some vertices of $C' \cup P$ might already be in the 1-neighbour set $U$: let $X$ be these
vertices. Note that $X$ and $V(H)$ are disjoint because of Property 10. Let $T'$ be a
sub-arborescence of $T$ such that:

• $T'$ has the same root as $T$, and

• $|V(T' \cup C') \cup X| = |V(H)| + |X|$.

Since $|V(T \cup C')| = |V(P \cup H \cup C')| \ge |V(H)| + |X|$ and $T \cup C'$ is connected, such an
in-arborescence exists.

Let $B = (A_U \setminus H) \cup T' \cup C'$. Let $B'$ be a witness spanning $V(B)$ contained in $B$ that
contains the arcs in $C'$. We have that $B'$ has $|U|$ vertices and contains a smallest cycle of $K$.

We repeat this procedure for any SCC in our witness that contains a cycle of a maximal 
SCC of G that is not smallest or contains two cycles of a maximal SCC. □ 

To describe the algorithm, let $\mathcal{D} = (S, F)$ be the DAG of maximal SCCs of $G$ and let
$\varepsilon > 1/k$ be a fixed constant where $k$ is the knapsack bound. (If $\varepsilon \le 1/k$ then the
brute force algorithm which considers all subsets $V' \subseteq V(G)$ with $|V'| \le k$ yields an
acceptable bound for a PTAS.)

We say that $u \in S$ is large if $c(u) > \varepsilon k$, petite if $1 < c(u) \le \varepsilon k$, or tiny if
$c(u) = 1$. Let $L$, $P$, and $T$ be the sets of all large, petite and tiny SCCs respectively.
Note that since $\varepsilon > 1/k$, for every $u \in L$, $c(u) > \varepsilon k > 1$.

UNIFORM-DIRECTED-1-NEIGHBOUR
    $B = \emptyset$
    For every subset $X \subseteq L$ such that $|X| \le 1/\varepsilon$
        $\mathcal{D}_X = \mathcal{D}[P \cup X]$
        $Z$ = {tiny sinks of $\mathcal{D}$} $\cup$ {petite sinks of $\mathcal{D}_X$}
        $P'$ = any maximal subset of $Z$ such that $c(P') + c(X) \le k$
        $U = \bigcup_{K \in P' \cup X} \{V(C) : C$ is a smallest cycle of $K\}$
        Greedily add vertices to $U$ such that $U$ remains a 1-neighbour
            set until there are no more vertices to add or
            $|U| = k$. (Via a backwards search rooted at $U$.)
        $B = \arg\max\{|B|, |U|\}$
    Return $B$.
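
The greedy step of the pseudocode, a backwards search rooted at $U$, might look as follows in Python. This is a minimal sketch under stated assumptions: the function name `grow_one_neighbour_set` and the adjacency-dictionary input are hypothetical, and the seed set is assumed to already be a 1-neighbour set with at most $k$ vertices.

```python
from collections import deque


def grow_one_neighbour_set(adj, seed, k):
    """Greedily extend `seed` to a 1-neighbour set with at most k vertices.

    adj maps each vertex to an iterable of its out-neighbours (assumed format).
    A vertex is added only when it has an arc into the current set, so the
    directed 1-neighbour constraint holds for every added vertex.
    """
    # Reverse adjacency: rev[v] lists the vertices with an arc into v.
    rev = {}
    for u, outs in adj.items():
        for v in outs:
            rev.setdefault(v, []).append(u)
    chosen = set(seed)
    queue = deque(chosen)
    while queue and len(chosen) < k:
        v = queue.popleft()
        for u in rev.get(v, ()):
            if u not in chosen and len(chosen) < k:
                chosen.add(u)
                queue.append(u)
    return chosen
```

Starting from the vertices of a 2-cycle with a path feeding into it, the search walks the path backwards until either the path or the capacity is exhausted.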



Theorem 12. UNIFORM-DIRECTED-1-NEIGHBOUR is a PTAS for the uniform, directed
1-neighbour knapsack problem.

Proof. Let $U^*$ be an optimal 1-neighbour knapsack and let $A_{U^*}$ be its witness as
guaranteed by Lemma 11. Let $\mathcal{L}$, $\mathcal{P}$ and $\mathcal{T}$ be the sets of large, petite, and tiny
cycles in $A_{U^*}$ respectively. By Lemma 11, each of these cycles is in a different maximal
SCC and each cycle is a smallest cycle in its maximal SCC.

Let $\mathcal{L} = \{L_1, \ldots, L_\ell\}$ and let $L^*$ be the set of large SCCs that intersect
$L_1, \ldots, L_\ell$. Note that $|L^*| = \ell$. Since
$k \ge |U^*| \ge \sum_{i=1}^{\ell} |L_i| > \ell \varepsilon k$, we have $\ell < 1/\varepsilon$. So, in some iteration
of UNIFORM-DIRECTED-1-NEIGHBOUR, $X = L^*$. We analyze this iteration of the algorithm.
There are two cases:

$P' = Z$. First we show that every vertex in $U^*$ has a descendant in $X \cup P'$. Clearly if a
vertex of $U^*$ has a descendant in some $L_i \in \mathcal{L}$, it has a descendant in $X$. Suppose a
vertex of $U^*$ has a descendant in some $P_i \in \mathcal{P}$. $P_i$ is within an SCC of $\mathcal{D}_X$, and so
it must have a descendant that is in a sink of $\mathcal{D}_X$. Similarly, suppose a vertex of $U^*$
has a descendant in some $T_i \in \mathcal{T}$. $T_i$ is either a sink in $\mathcal{D}$ or has a descendant that
is either a sink of $\mathcal{D}$ or a sink of $\mathcal{D}_X$. All these sinks are contained in $X \cup P'$. Since
every vertex of $U^*$ can reach a vertex in $X \cup P'$, greedily adding to this set results in
$|U| = |U^*|$ and the result of UNIFORM-DIRECTED-1-NEIGHBOUR is optimal.

$P' \ne Z$. For any sink $x \notin P'$, $c(P') + c(X) + c(x) > k$ but $c(x) \le \varepsilon k$ by the definition of
tiny and petite. So, $|U| \ge c(P') + c(X) > (1 - \varepsilon)k$, and the resulting solution is within
$(1 - \varepsilon)$ of optimal.

The running time of UNIFORM-DIRECTED-1-NEIGHBOUR is $n^{O(1/\varepsilon)}$. It is dominated by
the number of iterations, each of which can be executed in polynomial time. □

4 The uniform, undirected 1-neighbour problem



We now consider the final case of 1-neighbour problems, namely the uniform, undirected 1- 
neighbour problem. We note that there is a relatively straightforward linear time algorithm 
for finding an optimal solution for instances of this problem. The algorithm essentially 
breaks the graph into connected components and then, using a counting argument, builds 
an optimal solution from the components. 

Theorem 13. The uniform, undirected 1-neighbour knapsack problem has a linear-time
solution.

Proof. Let $\mathcal{G} = (G_1, G_2, \ldots, G_t)$ be the connected components of the dependency graph $G$
in decreasing order by size (we can find such an ordering in linear time). Note that each
connected component $G_j$ constitutes a feasible set for the uniform, undirected 1-neighbour
problem on $G$. If $k$ is odd and $|G_j| = 2$ for all $j$, then the optimal solution has size $k - 1$
since no vertex can be included on its own. In this case the first $\lfloor k/2 \rfloor$ connected
components constitute a feasible, optimal solution.

Otherwise, let $i$ be the smallest index such that $\sum_{j=1}^{i} |G_j| > k$. If $i = 1$ then let $S = 0$.
Otherwise, take $S = \sum_{j=1}^{i-1} |G_j|$. If $S = k$ then the first $i - 1$ components of $\mathcal{G}$ have
exactly $k$ nodes and constitute a feasible, optimal solution for $G$. Otherwise, by our choice
of $i$, $S < k$ and $|G_i| > k - S$. Let $u = (u_1, u_2, \ldots, u_{|G_i|})$ be an ordering of the nodes in
$G_i$ given by a breadth-first search (start the search from an arbitrary node). Collect the
first $k - S$ nodes of $u$ in $U = \{u_l \mid l \le k - S\}$. We consider three cases:

1. If $|U| = 1$ and $|G_t| = 1$, then the first $i - 1$ connected components along with $G_t$
constitute a feasible, optimal solution.

2. If $|U| = 1$ and $|G_t| \ne 1$, then $|G_i| \ge 2$. If $i = 1$ then return $\emptyset$ since there is no
feasible solution, otherwise drop an appropriate node from one of the first $i - 1$
components (one whose removal keeps the rest of that component connected) and add $u_2$ to $U$
since $|G_i| > 1$. Now the first $i - 1$ connected components (without the one dropped node)
along with $U$ constitute a feasible, optimal solution.

3. If $|U| > 1$, then the first $i - 1$ connected components along with $U$ constitute a feasible,
optimal solution.

□ 
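
The proof relies on listing the connected components in decreasing order of size in linear time. One way to do this, sketched below, is a graph search followed by a counting sort on the sizes. The function name `component_sizes_sorted` and the adjacency-dictionary input (with each undirected edge listed in both directions) are assumptions made for illustration only.

```python
def component_sizes_sorted(adj, vertices):
    """Connected-component sizes in decreasing order, in O(|V| + |E|) time.

    adj maps each vertex to its neighbours; every undirected edge is assumed
    to appear in both adjacency lists.
    """
    seen = set()
    sizes = []
    for s in vertices:
        if s in seen:
            continue
        # Depth-first search collecting one component.
        seen.add(s)
        stack = [s]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        sizes.append(size)
    # Counting sort: every size lies between 1 and the number of vertices.
    counts = [0] * (len(seen) + 1)
    for size in sizes:
        counts[size] += 1
    return [size for size in range(len(counts) - 1, 0, -1)
            for _ in range(counts[size])]
```

Only the sizes are returned here; a full implementation of the case analysis would also keep the vertex lists of each component.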

5 The all-neighbours knapsack problem 

In this section, we consider the all-neighbours knapsack problem. Our primary result is 
a PTAS for the uniform, directed all-neighbours problem. We also show that uniform, 
directed all-neighbours is NP-hard in the strong sense, so no polynomial-time algorithm can 
yield a better approximation unless P=NP. In addition, we show that uniform, undirected 
all-neighbours knapsack reduces to the classic knapsack problem.

A set of vertices $U$ is a feasible all-neighbours knapsack solution if, for every vertex
$u \in U$, $N_G(u) \subseteq U$. Recall that each SCC $c \in V(\mathcal{D})$ is obtained by contracting
$V(c) \subseteq V(G)$. For convenience, let $w(c) = w(V(c))$ and $p(c) = p(V(c))$. Let
$\mathcal{S} = \{\mathrm{desc}_{\mathcal{D}}(u) \mid u \in V(\mathcal{D})\}$ be the set of descendant sets for every node of
$\mathcal{D}$. We now show that all feasible solutions to the all-neighbour knapsack problem can be
decomposed into sets from $\mathcal{S}$.

Property 14. Every feasible solution to a general, directed all-neighbour instance has the
form $\bigcup_{u \in Q} V(u)$ where $Q \subseteq \mathcal{S}$.

Proof. Let $U$ be a feasible solution for the dependency graph $G$. We claim that if $u \in U$
then there exists a set $S \in \mathcal{S}$ such that $u \in V(S)$ and $V(S) \subseteq U$. Notice that the
all-neighbours constraint implies that if $b$ is a neighbour of $a$ in $G$ and $c$ is a neighbour of
$b$ in $G$, then $a \in U$ implies $c \in U$. Thus, by transitivity, if $a \in U$ and $b$ is reachable
from $a$ then $b \in U$. Let $u \in U$ and $v$ be the node in $\mathcal{D}$ such that $u \in V(v)$. Suppose that
$w \in \mathrm{desc}_{\mathcal{D}}(v)$. Then every node in $V(w)$ is reachable from $u$ in $G$, as is every node
in $V(\mathrm{desc}_{\mathcal{D}}(v))$, so $V(\mathrm{desc}_{\mathcal{D}}(v)) \subseteq U$, which proves the claim since
$\mathrm{desc}_{\mathcal{D}}(v) \in \mathcal{S}$. The property follows. □

Property 14 tells us that if $U$ is a feasible solution for $G$ and $u \in U$, then every node
reachable from $u$ in $G$ must also be in $U$. We use this property extensively
throughout the rest of Section 5.
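
To make the consequence of Property 14 concrete, the following sketch computes the set of vertices reachable from a given vertex and checks the all-neighbours constraint directly. The names `reachable_closure` and `is_all_neighbours_feasible`, and the adjacency-dictionary format, are illustrative assumptions rather than notation from the paper.

```python
def reachable_closure(adj, start):
    """Smallest feasible all-neighbours set containing `start`:
    every vertex reachable from it in the directed graph adj."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_all_neighbours_feasible(adj, U):
    """True if every out-neighbour of every vertex of U is also in U."""
    return all(v in U for u in U for v in adj.get(u, ()))
```

Any feasible solution is a union of such closures, which is why the algorithms in this section can work with descendant sets of the SCC DAG instead of individual vertices.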

5.1 The uniform, directed all-neighbour knapsack problem 

We show that UNIFORM-DIRECTED-ALL-NEIGHBOUR (below) is a PTAS for the uniform,
directed all-neighbours knapsack problem. The key ideas are to (a) identify a set $A$ of heavy
nodes in $V(\mathcal{D})$, i.e., those nodes $v$ where $w(v) > \varepsilon k$, and then (b) augment subsets of the
heavy nodes with nodes from the set $B$ of light nodes, i.e., those nodes $v$ with $w(v) \le \varepsilon k$.
We note that this algorithm works on the set of SCCs and can handle a case slightly more
general than the uniform one: that in which the weight and profit of a vertex are equal, but
different vertices may have different weights.



UNIFORM-DIRECTED-ALL-NEIGHBOUR
    $A = \{v \in V(\mathcal{D}) \mid w(v) > \varepsilon k\}$, $B = S \setminus A$, $X = \emptyset$
    For every subset $A'$ of $A$ such that $|A'| \le 1/\varepsilon$
        $T = \mathrm{desc}_{\mathcal{D}}(A')$
        Let $B' = \{v \mid v \in B \cap (V(\mathcal{D}) \setminus T)$ and $N_{\mathcal{D}}(v) \subseteq T\}$
        While $w(T) < k$ and $B' \ne \emptyset$
            Add any element $b \in B'$ to $T$.
            Update $B' = \{v \mid v \in B \cap (V(\mathcal{D}) \setminus T)$ and $N_{\mathcal{D}}(v) \subseteq T\}$
        If $w(V(T)) > w(X)$ then $X = V(T)$
    Return $X$

Theorem 15. UNIFORM-DIRECTED-ALL-NEIGHBOUR is a PTAS for the uniform, directed
all-neighbour knapsack problem.



Proof. Let $U^*$ be a set of vertices of $G$ forming an optimal solution to the uniform, directed
all-neighbours knapsack problem. By Property 14 there is a subset of nodes $Q^* \subseteq \mathcal{D}$ such
that $U^* = \bigcup_{u \in Q^*} V(u)$. Let $A^* = U^* \cap A$. Since the size of any node in $A$ is at least
$\varepsilon k$ and the weight of $U^*$ is at most $k$, $|A^*| \le 1/\varepsilon$. Since all subsets of $A$ of size at
most $1/\varepsilon$ are considered in the for loop of UNIFORM-DIRECTED-ALL-NEIGHBOUR, set $A^*$
will be one such set.

Let $D^* = \mathrm{desc}(A^*)$. Let $B$ be all the nodes of $\mathcal{D}$ added to the solution in all iterations
of the while loop. Let $T^* = D^* \cup B$.

Since $A^* \subseteq U^*$, $D^* \subseteq U^*$ by Property 14. Let $B^* = U^* \setminus D^*$. $B$ and $B^*$ are not
necessarily the same set of nodes. Suppose $B$ and $B^*$ are not the same set of nodes and
$w(T^*) < (1 - \varepsilon)w(U^*)$. Then there is a node $u \in B^* \setminus B$ such that $u$'s neighbours are in
$T^*$. Since $w(u) \le \varepsilon k$, $u$ could be added to $B$, a contradiction.

We now bound the running time of UNIFORM-DIRECTED-ALL-NEIGHBOUR. Line 1, which
finds the set of heavy nodes $A \subseteq V(\mathcal{D})$, computes a simple set difference, and initializes
the return value, takes at most $O(n)$ time. Since $|A| \le n/(\varepsilon k)$ and $|A'| \le 1/\varepsilon$ there are
at most $\binom{n/(\varepsilon k)}{1/\varepsilon} \le (n/\varepsilon k)^{1/\varepsilon}$ subsets of $A$ considered in line 2, so line 2
executes at most $(n/\varepsilon k)^{1/\varepsilon}$ times. Since we will never execute line 4 more than $n$
times we have an $O(n^{O(1/\varepsilon)})$-time algorithm. □
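
The procedure analysed above can be prototyped directly for unit weights. The sketch below is not the authors' implementation: the function names, the adjacency-dictionary input, and the quadratic SCC computation by mutual reachability are choices made for brevity. It also adds an explicit capacity check before each light node is inserted, which is an assumption on our part about how the while loop is meant to respect the bound $k$.

```python
from itertools import combinations


def reach(adj, s):
    """Vertices reachable from s (including s) in the digraph adj."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def uniform_directed_all_neighbour(adj, vertices, k, eps):
    """Sketch of UNIFORM-DIRECTED-ALL-NEIGHBOUR with unit vertex weights."""
    r = {v: reach(adj, v) for v in vertices}
    # SCC of v = vertices mutually reachable with v (quadratic, for brevity).
    scc = {v: frozenset(u for u in r[v] if v in r[u]) for v in vertices}
    nodes = set(scc.values())
    # desc[c]: nodes of the SCC DAG reachable from c, including c itself.
    desc = {c: {scc[u] for u in r[next(iter(c))]} for c in nodes}

    def weight(T):
        return sum(len(c) for c in T)

    heavy = [c for c in nodes if len(c) > eps * k]
    light = nodes - set(heavy)
    best = set()
    for size in range(int(1 / eps) + 1):
        for chosen in combinations(heavy, size):
            T = set().union(*(desc[c] for c in chosen))
            if weight(T) > k:
                continue
            while True:
                # Light nodes whose DAG out-neighbours already lie in T and
                # that still fit (the capacity test is our assumption).
                cand = [b for b in light - T
                        if desc[b] - {b} <= T and weight(T) + len(b) <= k]
                if not cand:
                    break
                T.add(cand[0])
            if weight(T) > weight(best):
                best = T
    return set().union(*best)
```

Because $T$ is always closed under descendants in the DAG, the condition `desc[b] - {b} <= T` is equivalent to requiring that all out-neighbours of `b` are in $T$.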

Theorem 16. The uniform, directed all-neighbour problem is NP-hard.

Proof. We reduce the set-union knapsack problem (SUKP) to the uniform, directed
all-neighbours knapsack problem. An instance of SUKP consists of a base set of elements
$S = \{x_1, x_2, \ldots, x_n\}$ where each $x_i$ has an integer weight $w_i$, a positive integer capacity
$c$, a target profit $d$, and a collection $C = \{S_1, \ldots, S_m\}$ where $S_i \subseteq S$ and each subset $S_i$
has a non-negative profit $p_i$. Then the question asked is: Does there exist a sub-collection
$C' = \{S_{i_1}, S_{i_2}, \ldots, S_{i_t}\}$ of $C$ such that $\sum_{j=1}^{t} p_{i_j} \ge d$ and, for
$T = \bigcup_{j=1}^{t} S_{i_j}$, $\sum_{x_s \in T} w_s \le c$? This problem is known to be NP-hard in
the strong sense even for the case where $w_i = p_i = 1$ and $|S_i| = 2$ for $1 \le i \le m$ [3].

We consider instances of SUKP where every subset $S_j$ in $C$ has cardinality 2 and profit
$p_j = 1$. Also, each element $x_i$ has weight $w_i = 1$. Let $c$ be the capacity and $d$ be the target
profit. Given such an instance of SUKP we define next an instance of uniform, directed
all-neighbours that has a solution if and only if the SUKP instance has a solution.

Let $G = (V, A)$ be a directed graph where for each element $x_i$ there is a strongly connected
component $scc_i$ with $M = d + 1$ nodes, one of which is labeled $z_i$. Let $U_i$ denote the set of
nodes in $scc_i$. For each subset $S_j$ there is a node $v_j \in V$. For every $x_i \in S_j$ there is an arc
$(v_j, z_i) \in A$ and these are the only other arcs. Let $k = cM + d$ be the target party size.
Then we claim that there is a party of size $k$ if and only if there is a solution to the SUKP
instance having weight at most $c$ and profit at least $d$.

Suppose there is a solution $P$ of size $k$ to uniform, directed all-neighbours. Since
$k = cM + d$ and $M > d$, there must be some collection $K$ of node sets $U_i$ of strongly
connected components such that $P$ contains the union of nodes of the $U_i$'s in $K$ where
$|K| \le c$. Hence $P$ must also contain a set $Z$ of at least $d$ nodes $v_j$. Since $P$ is a feasible
solution it must be that for every $v_j \in Z$, if $x_i \in S_j$ then $U_i \subseteq P$. It is straightforward
then to check that the collection of sets $C' = \{S_j : v_j \in Z\}$ is a solution to the SUKP
instance with profit $|Z| \ge d$, and since $\bigcup_{v_j \in Z} S_j \subseteq \{x_i : U_i \in K\}$ it has weight at
most $c$.

Now suppose $C' = \{S_{j_1}, S_{j_2}, \ldots, S_{j_t}\}$ is a solution to the SUKP instance where $t \ge d$ and
$|\bigcup_{l=1}^{t} S_{j_l}| \le c$. Let $N = \bigcup_{l=1}^{t} S_{j_l}$ and hence $|N| \le c$. Arbitrarily choose some
$K \subseteq C'$ where $|K| = d$. Then take $P' = \{v_j \mid S_j \in K\}$. Let $N'$ be a set of elements such
that $N \subseteq N'$ and $|N'| = c$. Define $P'' = \bigcup_{x_i \in N'} U_i$. Since $K \subseteq C'$, it must be that for
every $v_j \in P'$, if $x_i \in S_j$ then $U_i \subseteq P''$. Therefore $P = P' \cup P''$ is a solution to the
all-neighbours problem where $|P| = cM + d$. □



5.2 The uniform, undirected all-neighbour knapsack problem 

The problem of uniform, undirected all-neighbour knapsack is solvable in polynomial time.
In this case we just need to find the subset of connected components of $G$ whose total size is
as large as possible without exceeding $k$. But this is exactly the subset sum problem. Since
$k \le n$, the standard dynamic programming algorithm yields a truly polynomial-time $O(nk)$
solution.
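
The dynamic program referred to here is the textbook subset-sum table over the component sizes. A minimal sketch follows; the function name `best_component_total` is ours, and it returns only the best achievable total rather than the chosen components.

```python
def best_component_total(sizes, k):
    """Largest total <= k obtainable as a sum of a subset of `sizes`.

    sizes -- positive sizes of the connected components
    k     -- knapsack bound
    Runs in O(len(sizes) * k) time.
    """
    reachable = [True] + [False] * k  # reachable[t]: some subset sums to t
    for s in sizes:
        # Iterate downwards so each component is used at most once.
        for total in range(k, s - 1, -1):
            if reachable[total - s]:
                reachable[total] = True
    return max(t for t in range(k + 1) if reachable[t])
```

For components of sizes 4, 3 and 3 and $k = 7$ the best total is 7; for $k = 5$ it drops to 4, because no combination of components sums to 5.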



5.3 The general, all-neighbour knapsack problem 



As mentioned in Section 1.2, the general, directed, all-neighbours knapsack problem is a
generalization of the partially ordered knapsack problem [11], which has been shown to be
hard to approximate within a $2^{\log^{\delta} n}$ factor unless 3SAT $\in$ DTIME$(2^{n^{3/4+\varepsilon}})$ [5]. Hence the
general, directed all-neighbours knapsack problem is hard to approximate within this factor
under the same complexity assumption.

In the undirected case, i.e., the case where the dependency graph $G$ is undirected, $\mathcal{D}$
becomes a set of disjoint nodes, one for each connected component of $G$, and $\mathcal{S} = V(\mathcal{D})$. By
Property 14, we are left with the problem of finding a subset of nodes $Q \subseteq V(\mathcal{D})$ such that
$p(Q)$ is maximal subject to $w(Q) \le k$. But this is exactly the 0-1 knapsack problem, which
has a well-known FPTAS. Thus, general, undirected all-neighbours also has an FPTAS.
Contrast this with the uniform, directed all-neighbours problem. There, the sets in $\mathcal{S}$ are
not disjoint, so we cannot use the 0-1 knapsack ideas.



6 Future directions 

There are several open problems to consider, including closing gaps, improving the running 
times of the PTASes, and giving approximation algorithms for the general, directed versions 
of both 1-neighbour and all-neighbour. We believe that fully understanding these problems 
will lead to ideas for a much more general problem: maximizing a linear function with a 
submodular constraint.